Learning Distributed Word Representations For Bidirectional LSTM Recurrent Neural Network
Authors
Abstract
The bidirectional long short-term memory (BLSTM) recurrent neural network (RNN) has been successfully applied to many tagging tasks. A BLSTM-RNN relies on the distributed representation of words, which implies that the former can be further improved by learning the latter better. In this work, we propose a novel approach to learning distributed word representations by training a BLSTM-RNN on a specially designed task that relies only on unlabeled data. Our experimental results show that the proposed approach learns useful distributed word representations: the trained representations significantly elevate the performance of BLSTM-RNN on three tagging tasks: part-of-speech tagging, chunking, and named entity recognition, surpassing word representations trained by other published methods.
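The abstract describes tagging with a bidirectional recurrent network over word embeddings but gives no code. Below is a minimal sketch of the bidirectional idea only, assuming a toy vocabulary and random weights, and substituting plain tanh RNN cells for the LSTM cells the paper uses; every name and dimension here is hypothetical, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary and embeddings (random stand-ins for learned ones).
vocab = {"the": 0, "dog": 1, "runs": 2}
E = rng.normal(size=(len(vocab), 4))    # word embeddings, dimension 4
Wf = rng.normal(size=(4 + 3, 3)) * 0.1  # forward cell weights (simple RNN, not LSTM)
Wb = rng.normal(size=(4 + 3, 3)) * 0.1  # backward cell weights
Wo = rng.normal(size=(6, 2)) * 0.1      # output layer over 2 hypothetical tags

def rnn_pass(embs, W):
    """Run a simple tanh RNN over a list of embedding vectors."""
    h = np.zeros(3)
    states = []
    for x in embs:
        h = np.tanh(np.concatenate([x, h]) @ W)
        states.append(h)
    return states

def tag(sentence):
    embs = [E[vocab[w]] for w in sentence]
    fwd = rnn_pass(embs, Wf)              # left-to-right context
    bwd = rnn_pass(embs[::-1], Wb)[::-1]  # right-to-left context
    # Each position sees both past and future context: the key bidirectional property.
    return [int(np.argmax(np.concatenate([f, b]) @ Wo)) for f, b in zip(fwd, bwd)]

tags = tag(["the", "dog", "runs"])  # one tag id per word
```

With random weights the tag ids are meaningless; the point is only the shape of the computation: per-position forward and backward states are concatenated before the output layer.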
Similar papers
Gated Word-Character Recurrent Language Model
We introduce a recurrent neural network language model (RNN-LM) with long short-term memory (LSTM) units that utilizes both character-level and word-level inputs. Our model has a gate that adaptively finds the optimal mixture of the character-level and word-level inputs. The gate creates the final vector representation of a word by combining two distinct representations of the word. The character...
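The gating mechanism this abstract describes, combining two representations of a word into one vector, can be sketched as a sigmoid-weighted convex combination. This is an illustrative guess at the simplest such gate (a scalar gate computed from the word vector); the parameter names and dimensions are assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 5

# Hypothetical word-level and character-level representations of one word.
word_vec = rng.normal(size=dim)
char_vec = rng.normal(size=dim)

# Gate parameters: a scalar gate computed from the word vector (an assumption).
v = rng.normal(size=dim) * 0.1
b = 0.0

def gated_mix(word_vec, char_vec):
    g = 1.0 / (1.0 + np.exp(-(v @ word_vec + b)))  # sigmoid gate in (0, 1)
    return g * word_vec + (1.0 - g) * char_vec     # convex combination of the two views

mixed = gated_mix(word_vec, char_vec)
```

Because the gate is in (0, 1), each component of `mixed` lies between the corresponding components of the word-level and character-level vectors.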
A Unified Tagging Solution: Bidirectional LSTM Recurrent Neural Network with Word Embedding
Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) has been shown to be very effective for modeling and predicting sequential data, e.g. speech utterances or handwritten documents. In this study, we propose to use BLSTM-RNN for a unified tagging solution that can be applied to various tagging tasks including part-of-speech tagging, chunking and named entity recognition. Ins...
Deep Stacked Bidirectional and Unidirectional LSTM Recurrent Neural Network for Network-wide Traffic Speed Prediction
Short-term traffic forecasting based on deep learning methods, especially long short-term memory (LSTM) neural networks, has received much attention in recent years. However, the potential of deep learning methods is far from being fully exploited in terms of the depth of the architecture, the spatial scale of the prediction area, and the prediction power of spatial-temporal data. In this paper, a ...
Learning hypernymy in distributed word vectors via a stacked LSTM network
We aim to learn hypernymy present in distributed word representations using a deep LSTM neural network. We hypothesize that the semantic information of hypernymy is distributed differently across the components of the hyponym and hypernym vectors for varying examples of hypernymy. We use an LSTM cell with a replacement gate to adjust the state of the network as different examples of hypernymy a...
Recurrent neural networks with specialized word embeddings for health-domain named-entity recognition
BACKGROUND Previous state-of-the-art systems for Drug Name Recognition (DNR) and Clinical Concept Extraction (CCE) have focused on a combination of text "feature engineering" and conventional machine learning algorithms such as conditional random fields and support vector machines. However, developing good features is inherently time-consuming. Conversely, more modern machine learning ap...